
    Eliciting New Wikipedia Users' Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start

    Every day, thousands of users sign up as new Wikipedia contributors. Once they have joined, these users have to decide which articles to contribute to, which users to seek out and learn from or collaborate with, and so on. Each of these tasks is hard and potentially frustrating given the sheer size of Wikipedia. Supporting newcomers in their first steps by recommending articles they would enjoy editing or editors they would enjoy collaborating with is thus a promising route toward converting them into long-term contributors. Standard recommender systems, however, rely on users' histories of previous interactions with the platform. As such, these systems cannot make high-quality recommendations to newcomers who have no previous interactions -- the so-called cold-start problem. The present paper addresses the cold-start problem on Wikipedia by developing a method for automatically building short questionnaires that, when completed by a newly registered Wikipedia user, can be used for a variety of purposes, including article recommendations that help new editors get started. Our questionnaires are constructed from the text of Wikipedia articles as well as the history of contributions by already onboarded Wikipedia editors. We assess the quality of our questionnaire-based recommendations in an offline evaluation using historical data, as well as in an online evaluation with hundreds of real Wikipedia newcomers, concluding that our method yields cohesive, human-readable questions that perform well against several baselines. By addressing the cold-start problem, this work can help with the sustainable growth and maintenance of Wikipedia's diverse editor community. Comment: Accepted at the 13th International AAAI Conference on Web and Social Media (ICWSM 2019).
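
    Below is a minimal, self-contained Python sketch of the general idea of questionnaire-based cold-start recommendation described in this abstract: a newcomer's answers are mapped to a topic-interest vector, and articles are ranked by cosine similarity to that vector. The questions, topics, articles, and weights are invented for illustration; this is not the paper's actual questionnaire-mining method.

```python
# Toy cold-start recommender: a newcomer's questionnaire answers form a
# topic-interest vector, and articles are scored by cosine similarity.
# All questions, topics, articles, and weights are hypothetical examples,
# not data or methods from the paper.
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical questionnaire: each question maps a "yes" answer to topic weights.
QUESTIONS = {
    "Do you follow professional football?": {"sports": 1.0},
    "Do you read about space missions?":    {"astronomy": 1.0, "technology": 0.5},
    "Do you enjoy classical music?":        {"music": 1.0},
}

# Hypothetical article topic profiles (e.g. derived from article text).
ARTICLES = {
    "FIFA World Cup":       {"sports": 0.9},
    "Voyager 2":            {"astronomy": 0.8, "technology": 0.6},
    "Ludwig van Beethoven": {"music": 0.9},
}

def recommend(yes_answers, k=2):
    """Rank articles for a newcomer who answered 'yes' to the given questions."""
    interests = {}
    for q in yes_answers:
        for topic, w in QUESTIONS[q].items():
            interests[topic] = interests.get(topic, 0.0) + w
    scored = {a: cosine(interests, profile) for a, profile in ARTICLES.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]

print(recommend(["Do you read about space missions?"]))
# -> ['Voyager 2', ...]
```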

    Quantifying Engagement with Citations on Wikipedia

    Wikipedia, the free online encyclopedia that anyone can edit, is one of the most visited sites on the Web and a common source of information for many users. As an encyclopedia, Wikipedia is not a source of original information, but was conceived as a gateway to secondary sources: according to Wikipedia's guidelines, facts must be backed up by reliable sources that reflect the full spectrum of views on the topic. Although citations lie at the very heart of Wikipedia, little is known about how users interact with them. To close this gap, we built client-side instrumentation for logging all interactions with links leading from English Wikipedia articles to cited references during one month, and conducted the first analysis of readers' interactions with citations on Wikipedia. We find that overall engagement with citations is low: about one in 300 page views results in a reference click (0.29% overall; 0.56% on desktop; 0.13% on mobile). Matched observational studies of the factors associated with reference clicking reveal that clicks occur more frequently on shorter pages and on pages of lower quality, suggesting that references are consulted more commonly when Wikipedia itself does not contain the information sought by the user. Moreover, we observe that recent content, open-access sources, and references about life events (births, deaths, marriages, etc.) are particularly popular. Taken together, our findings open the door to a deeper understanding of Wikipedia's role in a global information economy where reliability is ever less certain and source attribution ever more vital. Comment: The Web Conference (WWW 2020), 10 pages.
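
    As a rough illustration of how headline rates like those above can be derived from instrumentation data, here is a short Python sketch that pools per-platform page views and reference clicks and reports click-through rates. The log schema and the counts are hypothetical; the paper's actual instrumentation and analysis pipeline are not reproduced here.

```python
# Aggregate a (hypothetical) event log of page views and reference clicks
# per platform and report click-through rates. Numbers are made up for
# illustration and do not reproduce the paper's results.
from collections import defaultdict

# (platform, page views, reference clicks) -- hypothetical aggregated counts.
event_log = [
    ("desktop", 1_000_000, 5_600),
    ("mobile",  1_000_000, 1_300),
]

views = defaultdict(int)
clicks = defaultdict(int)
for platform, v, c in event_log:
    views[platform] += v
    clicks[platform] += c

for platform in views:
    rate = clicks[platform] / views[platform]
    print(f"{platform}: {rate:.2%} of page views lead to a reference click")

# The pooled rate depends on how traffic is split across platforms,
# which is invented here.
overall = sum(clicks.values()) / sum(views.values())
print(f"overall: {overall:.2%}")
```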

    Keeping Up with the Trends: Analyzing the Dynamics of Online Learning and Hiring Platforms in the Software Programming Domain

    The Fourth Industrial Revolution has considerably sped up the pace of skill change in many professional domains, with scores of new skills emerging and many old skills moving toward obsolescence. For these domains, identifying the newly required skills in a timely manner is a difficult task for which existing methods are inadequate. Understanding the process by which new skills and technologies appear in and diffuse through a professional domain could give training providers more time to identify these new skills and react. For this purpose, in the present work, we look at the dynamics between online learning platforms and online hiring platforms in the software programming profession, a rapidly evolving domain. To do so, we fuse four data sources: Stack Overflow, an online community question-and-answer (Q&A) platform; Google Trends, which provides online search trends from Google; Udemy, a platform offering skill-based Massive Open Online Courses (MOOCs) where anyone can create courses; and Stack Overflow Jobs, a job ad platform. We place these platforms along two axes: i) how much expertise it takes, on average, to create content on them, and ii) whether, in general, the decision to create content on them is made by individuals or by groups. Our results show that the topics under study have a systematic tendency to appear earlier on platforms where content creation requires (on average) less expertise and is done more individually rather than by groups: Stack Overflow is found to be more agile than Udemy, which is itself more agile than Stack Overflow Jobs (Google Trends did not prove usable due to extreme data sparsity). However, our results also show that this tendency does not hold for all new skills, and that the software programming profession as a whole is remarkably agile: there are usually only a few months between the first Stack Overflow appearance of a new topic and its first appearance on Udemy or Stack Overflow Jobs. In addition, we find that Udemy's agility has increased dramatically over time. Our novel methodology provides valuable insights into the dynamics between online education and job ad platforms, enabling training program creators to examine these dynamics for various topics and to understand the pace of change. This allows them to maintain better awareness of trends and to prioritize their attention, both on the right topics and on the right platforms.
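
    A minimal Python sketch of the kind of lag measurement described above: for each topic, compare its first-appearance date on Stack Overflow with its first appearance on Udemy and on Stack Overflow Jobs. The topics and dates are made up for illustration and are not data from the study.

```python
# For each topic, compute how many days later it first appears on Udemy and
# on Stack Overflow Jobs than on Stack Overflow. Dates below are hypothetical.
from datetime import date

first_seen = {
    # topic:         (Stack Overflow,    Udemy,             Stack Overflow Jobs)
    "kubernetes":    (date(2014, 6, 10), date(2015, 1, 20), date(2015, 4, 2)),
    "rust-language": (date(2012, 3, 5),  date(2013, 2, 11), date(2013, 9, 30)),
}

for topic, (so, udemy, jobs) in first_seen.items():
    print(f"{topic}: Udemy lag = {(udemy - so).days} days, "
          f"Jobs lag = {(jobs - so).days} days")
```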
